United States Patent and Trademark Office



UNITED STATES DEPARTMENT OF COMMERCE 
United States Patent and Trademark Office

Address: COMMISSIONER FOR PATENTS 



APPLICATION NO.: 10/629,486
FILING DATE: 07/29/2003
FIRST NAMED INVENTOR:
ATTORNEY DOCKET NO.: Ben 2-16-1-10
CONFIRMATION NO.:

46363 7590 08/24

WALL & TONG, LLP/
ALCATEL-LUCENT USA INC.
595 SHREWSBURY AVENUE
SHREWSBURY, NJ 07702

EXAMINER: SAINT CYR, LEONARD
PAPER NUMBER:
DELIVERY MODE:


Please find below and/or attached an Office communication concerning this application or proceeding. 

The time period for reply, if any, is set in the attached communication. 



PTOL-90A (Rev. 04/07) 





Application No.: 10/629,486
Applicant(s): BEN ET AL.
Examiner: LEONARD SAINT CYR
Art Unit: 2626





-- The MAILING DATE of this communication appears on the cover sheet with the correspondence address --
Period for Reply 



A SHORTENED STATUTORY PERIOD FOR REPLY IS SET TO EXPIRE 3 MONTH(S) OR THIRTY (30) DAYS, 
WHICHEVER IS LONGER, FROM THE MAILING DATE OF THIS COMMUNICATION. 

- Extensions of time may be available under the provisions of 37 CFR 1.136(a). In no event, however, may a reply be timely filed
after SIX (6) MONTHS from the mailing date of this communication. 

- If NO period for reply is specified above, the maximum statutory period will apply and will expire SIX (6) MONTHS from the mailing date of this communication. 

- Failure to reply within the set or extended period for reply will, by statute, cause the application to become ABANDONED (35 U.S.C. § 133). 
Any reply received by the Office later than three months after the mailing date of this communication, even if timely filed, may reduce any 
earned patent term adjustment. See 37 CFR 1.704(b).

Status 

1) [X] Responsive to communication(s) filed on 28 May 2009.
2a) [X] This action is FINAL. 2b) [ ] This action is non-final.

3) [ ] Since this application is in condition for allowance except for formal matters, prosecution as to the merits is
closed in accordance with the practice under Ex parte Quayle, 1935 C.D. 11, 453 O.G. 213.

Disposition of Claims 

4) [X] Claim(s) 1-19, 21-30, 32-37 and 41-45 is/are pending in the application.

4a) Of the above claim(s) ______ is/are withdrawn from consideration.

5) [ ] Claim(s) ______ is/are allowed.

6) [X] Claim(s) 1-19, 21-30, 32-37 and 41-45 is/are rejected.
7) [ ] Claim(s) ______ is/are objected to.

8) [ ] Claim(s) ______ are subject to restriction and/or election requirement.

Application Papers 

9) [ ] The specification is objected to by the Examiner.

10) [X] The drawing(s) filed on 29 July 2003 is/are: a) accepted or b) objected to by the Examiner.

Applicant may not request that any objection to the drawing(s) be held in abeyance. See 37 CFR 1.85(a).

Replacement drawing sheet(s) including the correction is required if the drawing(s) is objected to. See 37 CFR 1.121(d).
11) [ ] The oath or declaration is objected to by the Examiner. Note the attached Office Action or form PTO-152.

Priority under 35 U.S.C. § 119 

12) [ ] Acknowledgment is made of a claim for foreign priority under 35 U.S.C. § 119(a)-(d) or (f).
a) All b) Some * c) None of:

1. [ ] Certified copies of the priority documents have been received.

2. [ ] Certified copies of the priority documents have been received in Application No. ______.

3. [ ] Copies of the certified copies of the priority documents have been received in this National Stage
application from the International Bureau (PCT Rule 17.2(a)).
* See the attached detailed Office action for a list of the certified copies not received. 



Attachment(s)

1) [ ] Notice of References Cited (PTO-892)
2) [ ] Notice of Draftsperson's Patent Drawing Review (PTO-948)
3) [ ] Information Disclosure Statement(s) (PTO/SB/08)
       Paper No(s)/Mail Date ______.
4) [ ] Interview Summary (PTO-413)
       Paper No(s)/Mail Date ______.
5) [ ] Notice of Informal Patent Application
6) [ ] Other: ______.



PTOL-326 (Rev. 08-06)



Office Action Summary 



Part of Paper No./Mail Date 20090819 



Application/Control Number: 10/629,486 
Art Unit: 2626 



Page 2 



DETAILED ACTION 

Response to Arguments 

1. Applicant's arguments filed 05/28/09 have been fully considered but they are not
persuasive.

Applicant argues that Weare et al., do not fairly suggest at least two subsets of a 
media program (Amendment, page 13). 

The examiner disagrees, since Weare et al., disclose "collection of media
entities, such as media entities (subsets of a media program) that are audio files,
or have portions that are audio files" (col.5, lines 15 - 22).

Applicant argues that neither Weare et al., nor McEachern, nor Logan fairly 
suggest grouping ones of said coefficients of said second representation to form 
segment; storing at least 30 minutes worth of segments; and selecting a plurality of 
segments (Amendment, page 14). 

The examiner disagrees, since Logan discloses "the feature vectors 
corresponding to the sequence of frames are organized into segments. For 
example, contiguous sequences of feature vectors may be combined into 
corresponding segments that are each of 1 second duration. Assuming the frames 
are 25 ms long and overlap each other by 12.5 ms, as described above, there will be 
approximately 80 feature vectors per segment. Obviously, segments of sizes other 
than 1 second may be utilized. By identifying those segments of the audio input 
that share similar cepstral features, the system has been able to automatically
decipher the song's structure" (col.5, lines 4 - 35, col.6, lines 53 - 56). Combining 
contiguous sequences of feature vectors into corresponding segments that are 
each of 1 second duration suggests storing at least 30 minutes worth of segments; 
and selecting a plurality of segments, since segments of sizes other than 1 second 
may be utilized; and segments of the audio input that share similar cepstral 
features are identified. 
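
The frame arithmetic in the quoted Logan passage can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and it assumes that a new frame starts every (frame length minus overlap) milliseconds, which is one plausible reading of the quoted figures rather than anything stated in the reference.

```python
def frames_per_segment(segment_ms: float, frame_ms: float, overlap_ms: float) -> int:
    """Count frame start times that fall inside one segment."""
    hop_ms = frame_ms - overlap_ms  # assumed spacing between frame starts
    if hop_ms <= 0:
        raise ValueError("overlap must be smaller than the frame length")
    return int(segment_ms // hop_ms)


def segments_for_duration(duration_s: float, segment_s: float) -> int:
    """Number of whole, non-overlapping segments in a recording."""
    return int(duration_s // segment_s)


# 25 ms frames overlapping by 12.5 ms, grouped into 1-second segments
print(frames_per_segment(1000.0, 25.0, 12.5))   # 80
# 30 minutes of audio cut into 1-second segments
print(segments_for_duration(30 * 60, 1.0))      # 1800
```

Under these assumptions, a 1-second segment holds 80 feature vectors, matching the "approximately 80" figure in the quotation, and 30 minutes of audio corresponds to 1800 one-second segments.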



Claim Rejections - 35 USC §112 

2. The following is a quotation of the first paragraph of 35 U.S.C. 112:

The specification shall contain a written description of the invention, and of the manner and process of 
making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the 
art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall
set forth the best mode contemplated by the inventor of carrying out his invention.

3. Claims 22, 23, 35, and 36 are rejected under 35 U.S.C. 112, first paragraph, as 
failing to comply with the written description requirement. The claim(s) contains subject 
matter which was not described in the specification in such a way as to reasonably 
convey to one skilled in the relevant art that the inventor(s), at the time the application 
was filed, had possession of the claimed invention. The invention, as described in the 
specification, page 7, and Figs. 1 - 6, does not show any means for grouping the 
coefficient of the second representation; means for searching a database for 
substantially matching segments; and means for determining whether said subsequent 
media program subset exhibits similarities to said initial media program subset. 

Claim Rejections - 35 USC § 103

4. The text of those sections of Title 35, U.S. Code not included in this action can 
be found in a prior Office action. 

Claims 1 - 19, 21 - 37, and 41 - 45 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Weare et al., (US Patent 7,065,416) in view of McEachern (US
Patent 5,615,302), and further in view of Logan et al., (US patent 6,633,845). 

Regarding claims 1, 24, and 34, Weare et al. disclose a method for program content
identification (see col. 6, lines 22-27), said method comprising the steps of: 

for each of at least two media program subsets, performing the steps of (col.5, 
lines 15-22): 

filtering each first frequency domain representation of blocks of said media
program subset using a plurality of filters to develop a respective second frequency
domain representation of each of said blocks of said media program, said second frequency
domain representation of each of said blocks having a reduced number of frequency
coefficients with respect to said first frequency domain representation (see col.
16, lines 47, fig. 7, element 750, describing a critical band filtering step which can be
modeled as a filter bank, thus indicating that a plurality of filters exist).

However, Weare et al., do not specifically teach that said plurality of filters have 
center frequencies logarithmically spaced apart from each other with a logarithmic 
additive factor of 1/12; grouping frequency coefficients of said second frequency domain 
representation of said blocks to form segments and selecting a plurality of said 
segments; comparing selected segments to features of stored programs to identify 
thereby said media program subset; determining whether said subsequent media
program subset exhibits similarities to said initial media program subset. 

McEachern teaches this 1/12 octave filter center frequency spacing results in 
logarithmically spaced filters that are very closely centered at the frequencies of the 
linearly spaced harmonics (col. 12, line 66 - col. 13, line 2).
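
A 1/12-octave spacing means consecutive center frequencies differ by 1/12 in log base 2, that is, each center is 2^(1/12) times the previous one. The sketch below illustrates why such centers fall close to linearly spaced harmonics. The 100 Hz base frequency and the function names are assumptions chosen for illustration, not values taken from McEachern.

```python
import math


def center_frequencies(base_hz: float, count: int) -> list[float]:
    """Centers spaced 1/12 octave apart: f_k = base_hz * 2 ** (k / 12)."""
    return [base_hz * 2.0 ** (k / 12.0) for k in range(count)]


def nearest_center_error(base_hz: float, harmonic: int) -> float:
    """Relative distance from a harmonic to the nearest 1/12-octave center."""
    target = base_hz * harmonic
    steps = round(12.0 * math.log2(harmonic))  # index of the nearest filter
    center = base_hz * 2.0 ** (steps / 12.0)
    return abs(center - target) / target


centers = center_frequencies(100.0, 13)
print(round(centers[12], 6))  # one octave above the base: 200.0

# Relative error, in percent, for the first six harmonics of the assumed base
for h in range(1, 7):
    print(h, f"{100 * nearest_center_error(100.0, h):.2f}%")
```

Because the centers are at most half a step away from any frequency, the relative error for any harmonic stays below 2^(1/24) - 1, or roughly 3 percent.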

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time the invention was made to use logarithm filters as taught by McEachern in Weare 
et al., because that would help extract the information content of audio signals (col. 1,
lines 10-14).

However, Weare et al., in view of McEachern do not specifically teach grouping
frequency coefficients of said second frequency domain representation of said blocks to 
form segments and selecting a plurality of said segments; comparing selected 
segments to features of stored programs to identify thereby said media program subset; 
determining whether said subsequent media program subset exhibits similarities to said 
initial media program subset. 

Logan et al., teach that the feature vectors corresponding to the sequence of 
frames are organized into segments. For example, contiguous sequences of 
feature vectors may be combined into corresponding segments that are each of 1 
second duration. The distortion between various segments of the song is measured in 
order to identify those segments that can be considered to be the same and those that are
dissimilar. By identifying those segments of the audio input that share similar cepstral
features, the system has been able to automatically decipher the song's structure (col. 5,
lines 4 - 35, col.6, lines 53 - 56).

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time the invention was made to group sequence of frames in segments as taught by 
Logan et al., in Weare et al., in view of McEachern, because that would help identify
specific songs (col.2, line 5). 

Regarding claim 2, Logan et al. further disclose that each grouping of frequency
coefficients of said second frequency domain to form a segment represents blocks that
are consecutive in time in said media program ("sequence of frames"; col.5, lines 5 - 35).

Regarding claim 3, Weare et al. in view of Logan et al., further disclose that said
plurality of filters are arranged in a group that processes a block at a time, the portion of 
said second frequency domain representation produced by said group for each block 
forms a frame, and wherein at least two frames are grouped to form a segment (Weare 
et al., see col. 18, Logan et al. col.5, lines 5 - 35). 

Regarding claim 4, Logan et al., further disclose that said selected segments 
correspond to portions of said media program that are not contiguous in time (col.6, 
lines 60-62). 

As per claim 5, Logan et al., further disclose that said plurality of filters includes
at least a set of triangular filters (col.4, lines 39 - 47).

As per claim 6, Logan et al., further disclose that said plurality of filters includes at least
a set of log-spaced triangular filters (col.4, lines 39 - 47).

Regarding claim 7, Weare et al. further disclose that the segments selected in 
said selecting step are those that have largest minimum segment energy (see col. 18, 
lines 10-15). 

Regarding claim 8, Weare et al. further disclose that the segments selected in 
said selecting step are selected in accordance with prescribed constraints (see col. 18, 
line 66 - col. 19 line 2, where only selecting peaks that last for more than specified 
number of frames prevents the peaks from being too close). 

Regarding claim 9, Logan et al., further suggest that the segments selected in 
said selecting step are selected for portions of said media program that correspond in 
time to prescribed search windows that are separated by gaps ("assuming the frames 
are 25 ms long and overlap each other by 12.5ms"; col.5, lines 5-12). 

Regarding claim 10, Weare et al. further disclose that the segments selected in 
said selecting step are those that result in the selected segments having a maximum 
entropy over the selected segments (see col. 18, lines 12-15, where the most energetic
peaks are chosen, thus choosing the most entropic peaks). 

Regarding claims 11-13, Weare et al. further suggest the step of
normalizing said frequency coefficients in said second frequency domain representation
after performing said grouping step, said normalization being performed on a
per-segment basis; wherein said normalization includes performing at least a
preceding-time normalization; an L2 normalization ("normalizing the sum"; see col. 16, lines 3-6).
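
Per-segment L2 normalization, as recited in the claim language above, scales a segment's coefficients so that their Euclidean norm is 1. The sketch below is a generic illustration: treating a segment as a list of frames of coefficients, and normalizing over all coefficients in the segment together, are assumptions and are not drawn from Weare et al. or from the application.

```python
import math


def l2_normalize_segment(segment: list[list[float]]) -> list[list[float]]:
    """Scale all coefficients in a segment so their overall L2 norm is 1."""
    norm = math.sqrt(sum(c * c for frame in segment for c in frame))
    if norm == 0.0:
        return [list(frame) for frame in segment]  # leave all-zero segments unchanged
    return [[c / norm for c in frame] for frame in segment]


segment = [[3.0, 0.0], [0.0, 4.0]]      # two frames, two coefficients each
normalized = l2_normalize_segment(segment)
print(normalized)                        # [[0.6, 0.0], [0.0, 0.8]]
```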

Regarding claim 14, Weare et al. further disclose that the step of storing said 
selected segments in a database in association with an identifier of said media program 
(see col. 7, lines 59-65, where music is stored in a database and for generating play 
lists thus an identifier must be associated with the stored data). 

Regarding claim 15, Weare et al. further disclose that the step of storing in said 
database information indicating timing of said selected segments (see col. 9, lines 16- 
21, where classifying the tempo in the database indicates timing of media segment).

Regarding claim 16, Weare et al. further disclose that said first frequency domain 
representation of blocks of said media program is developed by the steps of: digitizing 
an audio representation of said media program to be stored in said database (see col. 
16, lines 41-44); dividing the digitized audio representation into blocks of a prescribed 
number of samples (see col. 16, lines 41-44, where the audio representation is divided
into frames); smoothing said blocks using a filter (see col. 16, lines 45-47); and
converting said smoothed blocks into the frequency domain, wherein said
smoothed blocks are represented by frequency coefficients (see col. 16, lines 39-41).

As per claim 17, Logan et al., further disclose a Hamming window filter (col.4,
lines 25-27).

Regarding claim 18, Weare et al. further disclose that each of said smoothed 
blocks are converted into the frequency domain in said converting step using a Fast 
Fourier Transform (FFT) (see col. 16, lines 39-41 and col. 23, lines 52-54). 
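
The front-end steps recited for claims 16 to 18 (dividing digitized audio into blocks, smoothing each block with a window such as a Hamming window, and converting to the frequency domain) can be sketched as follows. This is a generic illustration, not the method of Weare et al. or Logan et al.: the block size, the test tone, and the function names are assumptions, and a plain discrete Fourier transform stands in for the FFT (an FFT computes the same coefficients faster).

```python
import cmath
import math


def split_into_blocks(samples: list[float], block_size: int) -> list[list[float]]:
    """Divide the signal into consecutive blocks, dropping any short tail."""
    usable = len(samples) - len(samples) % block_size
    return [samples[i:i + block_size] for i in range(0, usable, block_size)]


def hamming_window(block: list[float]) -> list[float]:
    """Taper the block edges with a Hamming window."""
    n = len(block)
    if n == 1:
        return list(block)
    return [x * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
            for i, x in enumerate(block)]


def dft_magnitudes(block: list[float]) -> list[float]:
    """Magnitudes of the non-negative-frequency DFT coefficients."""
    n = len(block)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(block)))
            for k in range(n // 2 + 1)]


# Hypothetical input: a tone with exactly 4 cycles per 32-sample block
block_size = 32
signal = [math.sin(2.0 * math.pi * 4 * i / block_size)
          for i in range(3 * block_size + 5)]

blocks = split_into_blocks(signal, block_size)
spectra = [dft_magnitudes(hamming_window(b)) for b in blocks]
peak_bin = max(range(len(spectra[0])), key=lambda k: spectra[0][k])
print(len(blocks), peak_bin)  # 3 4
```

Each entry of `spectra` plays the role of a "first frequency domain representation" of one block; a filter bank would then reduce it to fewer coefficients.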

As per claim 19, Logan et al., further disclose converting step using a discrete 
cosine transform (col.4, line 49). 

Regarding claims 21 and 37, Weare et al. disclose program content
identification (see col. 6, lines 22-27), comprising:

for each of at least two media program subsets, performing the steps of (col.5,
lines 15-22):

filtering each first frequency domain representation of blocks of said media
program subset using a plurality of filters to develop a respective second frequency
domain representation of each of said blocks of said media program, said second frequency
domain representation of each of said blocks having a reduced number of frequency
coefficients with respect to said first frequency domain representation (see col.
16, lines 47, fig. 7, element 750, describing a critical band filtering step which can be
modeled as a filter bank, thus indicating that a plurality of filters exist).

However, Weare et al., do not specifically teach that said plurality of filters have 
center frequencies logarithmically spaced apart from each other with a logarithmic 
additive factor of 1/12; grouping frequency coefficients of said second frequency domain 
representation of said blocks to form segments; storing at least 30 minutes worth of 
segments; and selecting a plurality of said segments. 

McEachern teaches this 1/12 octave filter center frequency spacing results in 
logarithmically spaced filters that are very closely centered at the frequencies of the 
linearly spaced harmonics (col. 12, line 66 - col. 13, line 2).

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time the invention was made to use logarithm filters as taught by McEachern in Weare 
et al., because that would help extract the information content of audio signals (col. 1,
lines 10-14).

However, Weare et al., in view of McEachern do not specifically teach grouping
frequency coefficients of said second frequency domain representation of said blocks to 
form segments; storing at least 30 minutes worth of segments; and selecting a plurality 
of said segments. 

Logan et al., teach that the feature vectors corresponding to the sequence of 
frames are organized into segments. For example, contiguous sequences of feature 
vectors may be combined into corresponding segments that are each of 1 second
duration. Obviously, segments of sizes other than 1 second may be utilized. By
identifying those segments of the audio input that share similar cepstral features, the
system has been able to automatically decipher the song's structure (segments of sizes
other than 1 second may be utilized suggests storing at least 30 minutes worth of
segments; col. 5, lines 4 - 35, col.6, lines 53 - 56).

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time the invention was made to group sequence of frames in segments as taught by 
Logan et al., in Weare et al., in view of McEachern, because that would help identify
specific songs (col.2, line 5). 

As per claim 22, Weare et al., teach an apparatus for program content 
identification comprising: 

a plurality of filters for filtering a first representation of a media program subset 
using frequency coefficient to develop a second representation of said media subset 
that has a reduced number of frequency coefficients with respect to said first 
representation for each of at least two media program subsets (see col. 16, lines 47, fig. 
7, element 750, describing a critical band filtering step which can be modeled as a filter 
bank, thus indicating that a plurality of filters exist). 

However, Weare et al., do not specifically teach that said plurality of filters have 
center frequencies logarithmically spaced apart from each other with a logarithmic 
additive factor of 1/12; means for grouping ones of said coefficients of said second 
representation to form segments; means for storing at least 30 minutes worth of
segments; and means for selecting a plurality of said segments. 

McEachern teaches this 1/12 octave filter center frequency spacing results in 
logarithmically spaced filters that are very closely centered at the frequencies of the 
linearly spaced harmonics (col. 12, line 66 - col. 13, line 2).

Therefore, it would have been obvious to one of ordinary skill in the art at the
time the invention was made to use logarithm filters as taught by McEachern in Weare
et al., because that would help extract the information content of audio signals (col. 1,
lines 10-14).

However, Weare et al., in view of McEachern do not specifically teach means for
grouping ones of said coefficients of said second representation to form segments; 
means for storing at least 30 minutes worth of segments; and means for selecting a 
plurality of said segments. 

Logan et al., teach that the feature vectors corresponding to the sequence of 
frames are organized into segments. For example, contiguous sequences of feature 
vectors may be combined into corresponding segments that are each of 1 second 
duration. Assuming the frames are 25 ms long and overlap each other by 12.5 ms, as 
described above, there will be approximately 80 feature vectors per segment. 
Obviously, segments of sizes other than 1 second may be utilized. By identifying
those segments of the audio input that share similar cepstral features, the system has 
been able to automatically decipher the song's structure (segments of sizes other than 1 
second may be utilized suggests storing at least 30 minutes worth of segments; col. 5,
lines 4 - 35, col.6, lines 53 - 56). 

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time the invention was made to group sequence of frames in segments as taught by 
Logan et al., in Weare et al., in view of McEachern, because that would help identify 
specific songs (col.2, line 5). 

As per claim 23, Weare et al., teach an apparatus for program content 
identification comprising: 

filtering a first frequency domain representation of a media program subset using 
a plurality of filters to develop a second frequency domain representation of each of said 
subsets of said media program having a reduced number of frequency coefficients
in said second frequency domain representation with respect to said first frequency
domain representation for each of at least two media program subsets (see col. 16, 
lines 47, fig. 7, element 750, describing a critical band filtering step which can be 
modeled as a filter bank, thus indicating that a plurality of filters exist). 

However, Weare et al., do not specifically teach means for filtering, wherein said 
plurality of filters have center frequencies logarithmically spaced apart from each other 
with a logarithmic additive factor of 1/12; means for grouping ones of said coefficients of 
said second representation to form segments; means for storing at least 30 minutes 
worth of segments; and means for selecting a plurality of said segments. 

McEachern teaches this 1/12 octave filter center frequency spacing results in
logarithmically spaced filters that are very closely centered at the frequencies of the
linearly spaced harmonics (col. 12, line 66 - col. 13, line 2).

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time the invention was made to use logarithm filters as taught by McEachern in Weare 
et al., because that would help extract the information content of audio signals (col. 1,
lines 10-14).

However, Weare et al., in view of McEachern do not specifically teach means for
grouping ones of said coefficients of said second representation to form segments; 
means for storing at least 30 minutes worth of segments; and means for selecting a 
plurality of said segments. 

Logan et al., teach that the feature vectors corresponding to the sequence of 
frames are organized into segments. For example, contiguous sequences of feature 
vectors may be combined into corresponding segments that are each of 1 second 
duration. Assuming the frames are 25 ms long and overlap each other by 12.5 ms, as 
described above, there will be approximately 80 feature vectors per segment. 
Obviously, segments of sizes other than 1 second may be utilized. By identifying 
those segments of the audio input that share similar cepstral features, the system has 
been able to automatically decipher the song's structure (segments of sizes other than 1 
second may be utilized suggests storing at least 30 minutes worth of segments; col.5, 
lines 4 - 35, col.6, lines 53 - 56). 

Therefore, it would have been obvious to one of ordinary skill in the art at the
time the invention was made to group sequence of frames in segments as taught by 
Logan et al., in Weare et al., in view of McEachern, because that would help identify 
specific songs (col.2, line 5). 

Regarding claim 25, Weare et al., further disclose that the step of indicating that 
said media program cannot be identified when substantially matching segments are not 
found in said database in said searching step ("media entities that have... dissimilar"; 
Abstract). 

Regarding claim 26, Logan et al., further disclose that said data base includes 
information indicating timing of segments of each respective media program identified 
therein, and wherein a match may be found in said searching step only when the timing 
of said segments produced in said grouping step substantially matches the timing of 
said segments stored in said database ("similar cepstral features, the system has been 
able to automatically decipher the song's structure"; col.6, lines 53 - 56). 

Regarding claim 27, Weare et al., further disclose that said matching between
segments is based on the Euclidean distances between segments (col. 11, lines 15 - 20).

Regarding claim 28, Weare et al., further disclose that the step of identifying said
media program as being the media program indicated by the identifier stored in said 
database having a best matching score when substantially matching segments are 
found in said database in said searching step ("matching algorithm... confidence level"; 
col.8, lines 1 - 12). 
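
Matching by Euclidean distance with a best-score selection, as discussed for claims 27 and 28, can be illustrated with a nearest-neighbor search. The sketch below is generic: the in-memory dictionary standing in for the database, the program identifiers, the distance threshold, and the function names are all hypothetical, and it does not reproduce the matching algorithm of Weare et al.

```python
import math
from typing import Optional


def euclidean_distance(a: list[float], b: list[float]) -> float:
    """Euclidean distance between two equal-length segment vectors."""
    if len(a) != len(b):
        raise ValueError("segments must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(query: list[float],
             database: dict[str, list[list[float]]],
             max_distance: float) -> Optional[str]:
    """Return the identifier whose stored segment is closest to the query,
    or None when no stored segment is within max_distance."""
    best_id, best_dist = None, math.inf
    for program_id, segments in database.items():
        for stored in segments:
            dist = euclidean_distance(query, stored)
            if dist < best_dist:
                best_id, best_dist = program_id, dist
    return best_id if best_dist <= max_distance else None


# Hypothetical stored segments keyed by program identifier
database = {
    "program-A": [[0.0, 1.0, 0.0], [0.5, 0.5, 0.0]],
    "program-B": [[1.0, 0.0, 0.0]],
}
print(identify([0.9, 0.1, 0.0], database, max_distance=0.5))  # program-B
print(identify([0.0, 0.0, 5.0], database, max_distance=0.5))  # None
```

Returning `None` when nothing falls within the threshold corresponds to the "cannot be identified" outcome discussed for claim 25.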

Regarding claim 29, Weare et al., further disclose that the step of determining a 
speed differential between said media program and a media program identified in said 
identifying step ("rate of speed"; col. 23, lines 1 - 5). 

Regarding claims 30, 32, and 33, Logan et al., in view of McEachern, and further 
in view of Weare et al., do not disclose wherein said matching score for a program 
P.sub.i is determined by 

wherein said determining step is based on an overlap score. 

However, since Weare et al., teach nearest neighbor and/or other matching 
algorithms may be utilized to locate songs that are similar... a confidence level for song 
classification may also be returned (col.8, lines 1 - 10). One having ordinary skill in the
art at the time the invention was made would have found it obvious to use a matching score
in Logan et al., in view of McEachern, and further in view of Weare et al., because that 
would help classify media entities (col.5, lines 7-12). 

As per claim 35, Weare et al., teach an apparatus for program content
identification comprising: 

filtering a first frequency domain representation of a media program subset using 
a plurality of filters to develop a second frequency domain representation of each of said 
subsets of said media program having a reduced number of frequency coefficients
in said second frequency domain representation with respect to said first frequency
domain representation for each of at least two media program subsets (see col. 16, 
lines 47, fig. 7, element 750, describing a critical band filtering step which can be 
modeled as a filter bank, thus indicating that a plurality of filters exist). 

However, Weare et al., do not specifically teach means for filtering, wherein said 
plurality of filters have center frequencies logarithmically spaced apart from each other 
with a logarithmic additive factor of 1/12; means for grouping ones of said coefficients of 
said second representation to form segments; means for searching a database for 
substantially matching segments, said database having stored therein segments of 
media programs and respective corresponding program identifiers; and means for 
determining whether said subsequent media program subset exhibits similarities to said 
initial media program subset. 

McEachern teaches this 1/12 octave filter center frequency spacing results in 
logarithmically spaced filters that are very closely centered at the frequencies of the 
linearly spaced harmonics (col. 12, line 66 - col. 13, line 2).

Therefore, it would have been obvious to one of ordinary skill in the art at the
time the invention was made to use logarithm filters as taught by McEachern in Weare
et al., because that would help extract the information content of audio signals (col. 1,
lines 10-14).

However, Weare et al., in view of McEachern do not specifically teach means for
grouping ones of said coefficients of said second representation to form segments; 
means for searching a database for substantially matching segments, said database 
having stored therein segments of media programs and respective corresponding 
program identifiers; and means for determining whether said subsequent media 
program subset exhibits similarities to said initial media program subset. 

Logan et al., teach that the feature vectors corresponding to the sequence of 
frames are organized into segments. For example, contiguous sequences of feature 
vectors may be combined into corresponding segments that are each of 1 second 
duration. Assuming the frames are 25 ms long and overlap each other by 12.5 ms, as 
described above, there will be approximately 80 feature vectors per segment. 
Obviously, segments of sizes other than 1 second may be utilized. By identifying 
those segments of the audio input that share similar cepstral features, the system has 
been able to automatically decipher the song's structure (col.5, lines 4 - 35, col. 6, lines 
53 - 56). 

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time the invention was made to group sequence of frames in segments as taught by 
Logan et al., in Weare et al., in view of McEachern, because that would help identify
specific songs (col.2, line 5). 

Regarding claim 36, Weare et al., in view of Logan et al., further disclose that 
said first frequency domain representation of said media program comprises a plurality 
of blocks of coefficients corresponding to respective time domain sections of said media 
program and said second frequency domain representation of said media program 
comprises a plurality of blocks of coefficients corresponding to respective time domain 
sections of said media program (Logan et al; col.5, lines 5 - 35; Weare et al., col. 16, 
lines 33-36). 

As per claims 41 - 45, Weare et al., further disclose at least two of said media 
subsets are associated with the same media program; at least two of said media 
subsets are associated with different media program ("media entities that are audio files
or have portions that are audio files"; Abstract).

Conclusion 

5. THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time 
policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within 
TWO MONTHS of the mailing date of this final action and the advisory action is not 
mailed until after the end of the THREE-MONTH shortened statutory period, then the
shortened statutory period will expire on the date the advisory action is mailed, and any
extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of
the advisory action. In no event, however, will the statutory period for reply expire later 
than SIX MONTHS from the mailing date of this final action. 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to LEONARD SAINT CYR whose telephone number is 
(571) 272-4247. The examiner can normally be reached on Mon-Friday.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Richemond Dorvil can be reached on (571) 272-7602. The fax phone
number for the organization where this application or proceeding is assigned is
(571) 273-8300.

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or (571) 272-1000. 
LS 